Symbolic Music Data Version 1.0

Author

  • Christian Walder
Abstract

In this document, we introduce a new dataset designed for training machine learning models of symbolic music data. Five datasets are provided, one of which is from a newly collected corpus of 20K MIDI files. We describe our preprocessing and cleaning pipeline, which includes the exclusion of a number of files based on scores from a previously developed probabilistic machine learning model. We also define training, testing and validation splits for the new dataset, based on a clustering scheme which we also describe. Some simple histograms are included.

Similar resources

Song2Quartet: A System for Generating String Quartet Cover Songs from Polyphonic Audio of Popular Music

We present Song2Quartet, a system for generating string quartet versions of popular songs by combining probabilistic models estimated from a corpus of symbolic classical music with the target audio file of any song. Song2Quartet allows users to add novelty to the listening experience of their favorite songs and gain familiarity with string quartets. Previous work in automatic arrangement of music o...

Towards Audio to Score Alignment in the Symbolic Domain

This paper presents a matrix factorization based feature for audio to score alignment. We show that in combination with dynamic time warping it can compete with chroma vectors, which are probably the most frequently used approach in recent years. A great benefit of the factorization-based feature is its sparseness, which can be used in order to transform it into a symbolic representation. ...

Mirex 2013: Discovering Musical Patterns Using Audio Structural Segmentation Techniques

This extended abstract discusses our pattern discovery algorithm submitted to the MIREX 2013 Discovery of Repeated Themes & Sections task. This algorithm estimates the musical patterns by finding specific repetitions within a piece and applying certain perceptually inspired rules. Four different versions of the algorithm were submitted: two that take an audio track as an input (monophonic and p...

Pattern Recognition Algorithms for Polyphonic Music Transcription

The main area of work in computer music related to information systems is known as music information retrieval (MIR). Databases containing musical information can be classified into two main groups: those containing audio data (digitized music) and those that file symbolic data (digital music scores). The latter are much more abstract than the former ones and contain a lot of information alread...

A Multimodal Way of Experiencing and Exploring Music

Significant digitization efforts have resulted in large multimodal music collections, which comprise music-related documents of various types and formats including text, symbolic data, audio, image, and video. The challenge is to organize, understand, and search musical content in a robust, efficient, and intelligent manner. Key issues concern the development of methods for analysing, correlati...

Journal title:
  • CoRR

Volume: abs/1606.02542

Publication date: 2016